Check DoNotEvict after filtering evictable pods to ensure termination can complete. #1294

Merged
merged 6 commits into from
Feb 9, 2022

@ellistarn ellistarn commented Feb 8, 2022

1. Issue, if available:
#1166

2. Description of changes:
Previously, the do-not-evict annotation could block termination logic, even if the node was unreachable. This change modifies the do-not-evict logic to apply only to evictable pods. Currently, the following pods are not considered evictable:

  • Pods owned by the node (a.k.a. static pods)
  • Pods with a deletion timestamp set past their graceful termination period (a.k.a. stuck termination)
  • Pods that tolerate the node's taints (i.e. would reschedule immediately)

This means that the do-not-evict annotation will no longer have any effect on pods in the above categories. Given that we already did not attempt to evict these pods, the semantics hold. However, previously, pods in these categories with the do-not-evict annotation could block node termination.

3. How was this change tested?

Reproduced

  1. Added the do-not-evict annotation to a pod
  2. Manually terminated a node in EC2
  3. Observed the Node and Pod stuck in Terminating

Fixed

  1. Started from step 3 of the scenario above
  2. Applied the PR changes to the local environment
  3. Node was gracefully terminated

4. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

netlify bot commented Feb 8, 2022

✔️ Deploy Preview for karpenter-docs-prod canceled.

🔨 Explore the source changes: 7d00e1e

🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/6203106ef4a6030008a3d54e

pod := test.Pod(test.PodOptions{
	NodeName: node.Name,
	ObjectMeta: metav1.ObjectMeta{Annotations: map[string]string{v1alpha5.DoNotEvictPodAnnotationKey: "true"}},
})

Consider testing static pods and a pod that tolerates NoSchedule. It should just be two more pod creation lines, plus adding the new pods to the respective lines below.

njtran previously approved these changes Feb 9, 2022
@njtran njtran left a comment

Thanks for the fix! Looks good!

@ellistarn ellistarn merged commit 5856e38 into aws:main Feb 9, 2022
@ellistarn ellistarn deleted the evict branch February 9, 2022 01:35
AndrewSirenko added a commit to AndrewSirenko/karpenter-provider-aws that referenced this pull request Aug 1, 2024